Daghestanian loans database

Authors: Ilya Chechuro, Michael Daniel, and Samira Verhees.

This database contains wordlists collected as part of the Daghestanian loans project by the Linguistic Convergence Laboratory at NRU HSE. The 160-item shortlist, which is based on the World Loanword Database questionnaire, is designed to measure lexical contact at a micro level: to quantify lexical convergence among the speech communities of minority languages at the village level, and to detect fine-grained areal patterns beyond general observations on the spheres of influence of certain languages.

Contents:

Target words: 25796
Languages: 23

How to cite this project

If you use data from the database in your research, please cite as follows:

Chechuro I., Daniel M., Dobrushina N., and Verhees S. 2019. Daghestanian loans database. Linguistic Convergence Laboratory, HSE. (Available online at https://lingconlab.github.io/Dagloan_database/DL_database.html, DOI, accessed on May 12, 2019.)

The database

For now, the table shows source Concepts and target Words. Each target word is assigned to a similarity Set: a set of words that have the same meaning and a similar form. In the future, data on borrowing sources will be added. Metadata includes the name of the Village where the word was recorded, the administrative District it belongs to, the Language spoken there, and the List ID. Each List ID corresponds to a particular speaker or, in some cases, to a written source such as a dictionary. The data is accessible at: Github/LingConLab/DagloanDatabase.
The dataset in the dummy format is available here.


Version: 2019-05-12. For questions or comments contact jh.verhees@gmail.com.


Map of the surveyed villages

Hover over and/or click on a dot on the map to learn more. The color of a dot corresponds to the number of lists collected in that village. Orange = dictionary data.

Sample lexical map

The map below shows the distribution of different stems for the concept ‘pepper’.

Sources of lexical influence

Cluster Dendrogram of Foreign Influence

This tree is built as follows. A distance of 0 is assigned only to two matching non-empty cells; otherwise the distance is 1. NAs are not counted.
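The rule above can be sketched in Python. This is a minimal illustration, not the project's own code: the function name `lenient_distance`, the use of `None` for NA, the placeholder cell values, and the choice to average the per-cell distances over the compared cells are all assumptions.

```python
from typing import Optional, Sequence


def lenient_distance(
    a: Sequence[Optional[str]], b: Sequence[Optional[str]]
) -> Optional[float]:
    """Mean per-cell distance between two speakers; cells with NA are skipped.

    Assumption: None stands for NA, and per-cell distances (0 or 1) are averaged.
    """
    if len(a) != len(b):
        raise ValueError("both speakers need the same number of cells")
    # Keep only the positions where both speakers have a value.
    compared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not compared:
        return None  # no cell can be compared
    # A matching pair contributes 0, any other pair contributes 1.
    mismatches = sum(1 for x, y in compared if x != y)
    return mismatches / len(compared)


# Placeholder values, not data from the database.
speaker_a = ["x", "y", None, "z"]
speaker_b = ["x", "z", "x", None]
print(lenient_distance(speaker_a, speaker_b))  # 0.5: one mismatch in two compared cells
```

Skipping NA cells means that two speakers are compared only on the concepts both of them provided, so missing answers do not inflate the distance.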

Columns of the printed table (R output; 125 rows omitted):

     Speaker Language Village District Alibeglo1 Arkhit1 Arkhit2 Arkhit3
     Arkhit4 Arkhit5 Arkhit6 Bezhta1 Darvag1 Darvag2 Darvag3 Darvag4
     Darvag5 Darvag6 Dyubek1 Dyubek2 Dyubek3 Dyubek4 Dzhavgat1 Dzhavgat2
     Dzhavgat3 Dzhavgat4 Dzhibakhni1 Dzhibakhni2 Dzhibakhni3 Dzhibakhni4
     Helmets1 Helmets2 Helmets3 Ikhrek1 Ikhrek2 Ikhrek3 Ikhrek4 Ilisu1
     Karata1 Karata2 Karata3 Karata4 Khapil1 Khapil2 Khapil3 Khapil4
     Khapil5 Khiv1 Khiv2 Khiv3 Khiv4 Khlut1 Khlut2 Khlut3 Khlut4 Khlut5
     Khoredzh1 Khoredzh2 Khoredzh3 Khoredzh4 Khoredzh5 Khoredzh6 Khutkhul1
     Khutkhul2 Khutkhul3 Khutkhul4 Kiche1 Kiche2 Kidero1 Kidero2 Kidero3
     Kina1 Kina2 Kina3 Kurag1 Kusur1 Laka1 Laka2 Laka3 Laka4 Laka5 Laka6
     Meshabash1 Meshabash2 Mikik1 Mikik2 Qax1 Qax2 Qax3 Qax4 Qax5 Qax6
     Qax7 Qax8 Qax9 Qum1 Qum2 Rikvani1 Rutul1 Tad-Magitl1 Tad-Magitl2
     Tatil1 Tatil2 Tatil3 Tatil4 Tatil5 Tlibisho1 Tlibisho2 Tlibisho3
     Tlibisho4 Tpig1 Tsinit1 Tsinit2 Tsinit3 Tsinit4 Tsinit5 Tukita1
     Yagdyg1 Yagdyg2 Yagdyg3 Yagdyg4 Yagdyg5 Yagdyg6 Yersi1 Yersi2 Yersi3
     Yersi4 Zilo1 Zilo2

Cluster Dendrogram of Foreign Influence (Strict Distances)

This tree is built as follows. A distance of 0 is assigned only to two matching non-empty cells; otherwise the distance is 1. NAs are counted, so a cell that is empty for either speaker also contributes a distance of 1. This leads to large distances even between similar speakers.
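A small Python sketch of the strict rule shows why the distances grow. It is an illustration under stated assumptions, not the project's own code: the function name `strict_distance`, the use of `None` for NA, the placeholder cell values, and the averaging over all cells are hypothetical.

```python
from typing import Optional, Sequence


def strict_distance(
    a: Sequence[Optional[str]], b: Sequence[Optional[str]]
) -> Optional[float]:
    """Mean per-cell distance between two speakers; NA cells count as distance 1.

    Assumption: None stands for NA, and per-cell distances (0 or 1) are averaged.
    """
    if len(a) != len(b):
        raise ValueError("both speakers need the same number of cells")
    if not a:
        return None  # no cells at all
    # Only two matching non-empty cells give 0; NA or differing values give 1.
    mismatches = sum(
        1 for x, y in zip(a, b) if x is None or y is None or x != y
    )
    return mismatches / len(a)


# Placeholder values, not data from the database.
# The two speakers agree wherever both answered, yet the gaps add distance.
speaker_a = ["x", None, "y", None]
speaker_b = ["x", None, "y", "z"]
print(strict_distance(speaker_a, speaker_b))  # 0.5: two of four cells involve NA
```

Under this rule, even a cell that is NA for both speakers adds to the distance, so incomplete lists look dissimilar regardless of how well the collected words agree.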

Mediation of Turkic influence (Speakers)

Mediation of Turkic influence (Villages)

Mediation of Total Turkic Influence

Mediation of Standard Azerbaijani Influence

Mediation of Turkic Influence via Major Languages

    Speaker Language  Village District        Lexeme Present
1 Alibeglo1 Georgian Alibeglo      Qax the_beeswax_9       0
2   Arkhit1  Lezgian   Arkhit     Khiv the_beeswax_9       0
3   Arkhit2  Lezgian   Arkhit     Khiv the_beeswax_9       0
4   Arkhit3  Lezgian   Arkhit     Khiv the_beeswax_9       0
5   Arkhit4  Lezgian   Arkhit     Khiv the_beeswax_9       0
6   Arkhit5  Lezgian   Arkhit     Khiv the_beeswax_9       0
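A table in this long format (one row per speaker and lexeme, with a 0/1 Present flag) can be reshaped or aggregated with a few lines of Python. The sketch below is hypothetical: the helper names `to_wide` and `share_present_by_village` are made up for illustration, the rows are the first three copied from the table above, and averaging Present per village is only one assumed way of moving from speakers to villages.

```python
from collections import defaultdict

# Rows copied from the table above:
# (Speaker, Language, Village, District, Lexeme, Present)
rows = [
    ("Alibeglo1", "Georgian", "Alibeglo", "Qax", "the_beeswax_9", 0),
    ("Arkhit1", "Lezgian", "Arkhit", "Khiv", "the_beeswax_9", 0),
    ("Arkhit2", "Lezgian", "Arkhit", "Khiv", "the_beeswax_9", 0),
]


def to_wide(rows):
    """Pivot long rows into {speaker: {lexeme: present}}."""
    wide = defaultdict(dict)
    for speaker, _language, _village, _district, lexeme, present in rows:
        wide[speaker][lexeme] = present
    return dict(wide)


def share_present_by_village(rows):
    """Average the Present flag over all rows of each village (assumed aggregation)."""
    totals = defaultdict(lambda: [0, 0])  # village -> [sum of Present, row count]
    for _speaker, _language, village, _district, _lexeme, present in rows:
        totals[village][0] += present
        totals[village][1] += 1
    return {village: s / n for village, (s, n) in totals.items()}


print(to_wide(rows))
print(share_present_by_village(rows))
```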
